424 research outputs found

    Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) to focus and clarify NLP system requirements. These tasks are time consuming, expensive, and require considerable effort on the part of human reviewers.</p> <p>Methods</p> <p>Using a set of clinical documents from the VA EMR for a particular use case of interest we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.</p> <p>Results</p> <p>Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.</p> <p>Conclusion</p> <p>Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.</p

    A literature-based similarity metric for biological processes

    Get PDF
    BACKGROUND: Recent analyses in systems biology pursue the discovery of functional modules within the cell. Recognition of such modules requires the integrative analysis of genome-wide experimental data together with available functional schemes. In this line, methods to bridge the gap between the abstract definitions of cellular processes in current schemes and the interlinked nature of biological networks are required. RESULTS: This work explores the use of the scientific literature to establish potential relationships among cellular processes. To this end we haveused a document based similarity method to compute pair-wise similarities of the biological processes described in the Gene Ontology (GO). The method has been applied to the biological processes annotated for the Saccharomyces cerevisiae genome. We compared our results with similarities obtained with two ontology-based metrics, as well as with gene product annotation relationships. We show that the literature-based metric conserves most direct ontological relationships, while reveals biologically sounded similarities that are not obtained using ontology-based metrics and/or genome annotation. CONCLUSION: The scientific literature is a valuable source of information from which to compute similarities among biological processes. The associations discovered by literature analysis are a valuable complement to those encoded in existing functional schemes, and those that arise by genome annotation. These similarities can be used to conveniently map the interlinked structure of cellular processes in a particular organism

    Measurement of the inclusive and dijet cross-sections of b-jets in pp collisions at sqrt(s) = 7 TeV with the ATLAS detector

    Get PDF
    The inclusive and dijet production cross-sections have been measured for jets containing b-hadrons (b-jets) in proton-proton collisions at a centre-of-mass energy of sqrt(s) = 7 TeV, using the ATLAS detector at the LHC. The measurements use data corresponding to an integrated luminosity of 34 pb^-1. The b-jets are identified using either a lifetime-based method, where secondary decay vertices of b-hadrons in jets are reconstructed using information from the tracking detectors, or a muon-based method where the presence of a muon is used to identify semileptonic decays of b-hadrons inside jets. The inclusive b-jet cross-section is measured as a function of transverse momentum in the range 20 < pT < 400 GeV and rapidity in the range |y| < 2.1. The bbbar-dijet cross-section is measured as a function of the dijet invariant mass in the range 110 < m_jj < 760 GeV, the azimuthal angle difference between the two jets and the angular variable chi in two dijet mass regions. The results are compared with next-to-leading-order QCD predictions. Good agreement is observed between the measured cross-sections and the predictions obtained using POWHEG + Pythia. MC@NLO + Herwig shows good agreement with the measured bbbar-dijet cross-section. However, it does not reproduce the measured inclusive cross-section well, particularly for central b-jets with large transverse momenta.Comment: 10 pages plus author list (21 pages total), 8 figures, 1 table, final version published in European Physical Journal

    Observation of associated near-side and away-side long-range correlations in √sNN=5.02  TeV proton-lead collisions with the ATLAS detector

    Get PDF
    Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √sNN=5.02  TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1  μb-1 of data as a function of transverse momentum (pT) and the transverse energy (ΣETPb) summed over 3.1<η<4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2<|Δη|<5) “near-side” (Δϕ∼0) correlation that grows rapidly with increasing ΣETPb. A long-range “away-side” (Δϕ∼π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ΣETPb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ΣETPb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos⁡2Δϕ modulation for all ΣETPb ranges and particle pT

    Measurement of the cross-section of high transverse momentum vector bosons reconstructed as single jets and studies of jet substructure in pp collisions at √s = 7 TeV with the ATLAS detector

    Get PDF
    This paper presents a measurement of the cross-section for high transverse momentum W and Z bosons produced in pp collisions and decaying to all-hadronic final states. The data used in the analysis were recorded by the ATLAS detector at the CERN Large Hadron Collider at a centre-of-mass energy of √s = 7 TeV;{\rm Te}{\rm V}andcorrespondtoanintegratedluminosityof and correspond to an integrated luminosity of 4.6\;{\rm f}{{{\rm b}}^{-1}}.ThemeasurementisperformedbyreconstructingtheboostedWorZbosonsinsinglejets.ThereconstructedjetmassisusedtoidentifytheWandZbosons,andajetsubstructuremethodbasedonenergyclusterinformationinthejetcentreofmassframeisusedtosuppressthelargemultijetbackground.ThecrosssectionforeventswithahadronicallydecayingWorZboson,withtransversemomentum. The measurement is performed by reconstructing the boosted W or Z bosons in single jets. The reconstructed jet mass is used to identify the W and Z bosons, and a jet substructure method based on energy cluster information in the jet centre-of-mass frame is used to suppress the large multi-jet background. The cross-section for events with a hadronically decaying W or Z boson, with transverse momentum {{p}_{{\rm T}}}\gt 320\;{\rm Ge}{\rm V}andpseudorapidity and pseudorapidity |\eta |\lt 1.9,ismeasuredtobe, is measured to be {{\sigma }_{W+Z}}=8.5\pm 1.7$ pb and is compared to next-to-leading-order calculations. The selected events are further used to study jet grooming techniques

    Search for pair-produced long-lived neutral particles decaying to jets in the ATLAS hadronic calorimeter in ppcollisions at √s=8TeV

    Get PDF
    The ATLAS detector at the Large Hadron Collider at CERN is used to search for the decay of a scalar boson to a pair of long-lived particles, neutral under the Standard Model gauge group, in 20.3fb−1of data collected in proton–proton collisions at √s=8TeV. This search is sensitive to long-lived particles that decay to Standard Model particles producing jets at the outer edge of the ATLAS electromagnetic calorimeter or inside the hadronic calorimeter. No significant excess of events is observed. Limits are reported on the product of the scalar boson production cross section times branching ratio into long-lived neutral particles as a function of the proper lifetime of the particles. Limits are reported for boson masses from 100 GeVto 900 GeV, and a long-lived neutral particle mass from 10 GeVto 150 GeV

    Search for R-parity-violating supersymmetry in events with four or more leptons in sqrt(s) =7 TeV pp collisions with the ATLAS detector

    Get PDF
    A search for new phenomena in final states with four or more leptons (electrons or muons) is presented. The analysis is based on 4.7 fb−1 of s=7  TeV \sqrt{s}=7\;\mathrm{TeV} proton-proton collisions delivered by the Large Hadron Collider and recorded with the ATLAS detector. Observations are consistent with Standard Model expectations in two signal regions: one that requires moderate values of missing transverse momentum and another that requires large effective mass. The results are interpreted in a simplified model of R-parity-violating supersymmetry in which a 95% CL exclusion region is set for charged wino masses up to 540 GeV. In an R-parity-violating MSUGRA/CMSSM model, values of m 1/2 up to 820 GeV are excluded for 10 < tan β < 40

    Search for direct pair production of the top squark in all-hadronic final states in proton-proton collisions at s√=8 TeV with the ATLAS detector

    Get PDF
    The results of a search for direct pair production of the scalar partner to the top quark using an integrated luminosity of 20.1fb−1 of proton–proton collision data at √s = 8 TeV recorded with the ATLAS detector at the LHC are reported. The top squark is assumed to decay via t˜→tχ˜01 or t˜→ bχ˜±1 →bW(∗)χ˜01 , where χ˜01 (χ˜±1 ) denotes the lightest neutralino (chargino) in supersymmetric models. The search targets a fully-hadronic final state in events with four or more jets and large missing transverse momentum. No significant excess over the Standard Model background prediction is observed, and exclusion limits are reported in terms of the top squark and neutralino masses and as a function of the branching fraction of t˜ → tχ˜01 . For a branching fraction of 100%, top squark masses in the range 270–645 GeV are excluded for χ˜01 masses below 30 GeV. For a branching fraction of 50% to either t˜ → tχ˜01 or t˜ → bχ˜±1 , and assuming the χ˜±1 mass to be twice the χ˜01 mass, top squark masses in the range 250–550 GeV are excluded for χ˜01 masses below 60 GeV

    Search for the neutral Higgs bosons of the minimal supersymmetric standard model in pp collisions at root s=7 TeV with the ATLAS detector

    Get PDF
    A search for neutral Higgs bosons of the Minimal Supersymmetric Standard Model (MSSM) is reported. The analysis is based on a sample of proton-proton collisions at a centre-of-mass energy of 7TeV recorded with the ATLAS detector at the Large Hadron Collider. The data were recorded in 2011 and correspond to an integrated luminosity of 4.7 fb-1 to 4.8 fb-1. Higgs boson decays into oppositely-charged muon or τ lepton pairs are considered for final states requiring either the presence or absence of b-jets. No statistically significant excess over the expected background is observed and exclusion limits at the 95% confidence level are derived. The exclusion limits are for the production cross-section of a generic neutral Higgs boson, φ, as a function of the Higgs boson mass and for h/A/H production in the MSSM as a function of the parameters mA and tan β in the mhmax scenario for mA in the range of 90GeV to 500 GeV. Copyright CERN
    corecore